Three Approaches to GO-Tagging Biomedical Abstracts

Authors

  • Neil Davis
  • Henk Harkema
  • Robert J. Gaizauskas
  • Yikun Guo
  • Moustafa Ghanem
  • Tom Barnwell
  • Yike Guo
  • Jon Ratcliffe
Abstract

In this paper we explore three approaches to assigning Gene Ontology semantic classifications to abstracts from the PubMed database: lexical lookup, information retrieval and machine learning. To evaluate the approaches we use two “gold” standards derived from the yeast genome database (SGD). While evaluation provides insights into the three approaches, it also reveals the difficulties in constructing a suitable gold standard for this task.


Similar resources

Identifying Experimental Techniques in Biomedical Literature

Named entity recognition of gene names, protein names, cell-lines, and other biologically relevant concepts has received significant attention by the research community. In this work, we considered named entity recognition of experimental techniques in biomedical articles. In our system to mine gene and disease associations, each association is categorized by the techniques used to derive the a...

Tagging gene and protein names in biomedical text

MOTIVATION The MEDLINE database of biomedical abstracts contains scientific knowledge about thousands of interacting genes and proteins. Automated text processing can aid in the comprehension and synthesis of this valuable information. The fundamental task of identifying gene and protein names is a necessary first step towards making full use of the information encoded in biomedical text. This ...

Tagging gene and protein names in full text articles

Current information extraction efforts in the biomedical domain tend to focus on finding entities and facts in structured databases or MEDLINE abstracts. We apply a gene and protein name tagger trained on Medline abstracts (ABGene) to a randomly selected set of full text journal articles in the biomedical domain. We show the effect of adaptations made in response to the greater heterogeneity o...

Gene Ontology (GO) Annotation in Biomedical Literature

In this paper, we propose an approach to Gene Ontology (GO) annotation of biomedical texts. The GO is an effort to create a controlled terminology for labelling gene functions more precisely. Our system is based on the application of Parametrized Finite-State Graphs (P-FSG) for GO tagging. This process was applied to the annotation of genes related to Alzheimer's disease. This pro...

Functional gene clustering via gene annotation sentences, MeSH and GO keywords from biomedical literature

Gene function annotation remains a key challenge in modern biology. This is especially true for high-throughput techniques such as gene expression experiments. Vital information about genes is available electronically from biomedical literature in the form of full texts and abstracts. In addition, various publicly available databases (such as GenBank, Gene Ontology and Entrez) provide access to...



Publication date: 2006